gh-140232: Do not track frozenset objects with immutables #140234
Co-authored-by: Mikhail Efimov <[email protected]>
Maybe it is worth to change
    PyObject *
    PyFrozenSet_Alloc(PyTypeObject *type, Py_ssize_t nitems)
    {
        PyObject *obj = PyType_GenericAlloc(type, nitems);
        if (obj == NULL) {
            return NULL;
        }
        _PyFrozenSet_MaybeUntrack(obj);
        return obj;
    }
The |
Code looks good to me.
…cpython into frozenset_immutable_tracking
Modules/_testcapimodule.c
Outdated
    PyObject *set = NULL, *empty_tuple=NULL, *tracked_object;

    tracked_object = PyImport_ImportModule("sys");
Maybe just create an empty list or dict here? Importing a module seems too heavy for testing purposes.
I agree a module seems heavy, but we cannot add a list or dict to a set. A custom class or namedtuple would also do, but they require more code to set up. Any suggestion for a simple-to-create hashable object tracked by the GC is welcome.
Ouch, I forgot about the requirement to be hashable. Thanks!
WDYT about exposing a function for PySet_Add in _testcapi and writing the tests in Python, like I did here: https://github.com/python/cpython/pull/140132/files#diff-70eaebed435342e02ba8f7f5a84e4eebd552438ce6ac2765e80abb5514bdea03R134?
Then you can write a test like:
    class Test:
        pass
    fs = pyset_add(Test())
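The requirement discussed in this thread (an element that is both hashable and tracked by the GC) can be checked from pure Python. The following is a minimal sketch, not code from the PR; the class name `Tracked` and the helper `is_hashable` are hypothetical names introduced for illustration.

```python
import gc


class Tracked:
    """Hypothetical plain class: instances hash by identity and are GC-tracked."""


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Print (hashable, tracked) for a few candidate test elements.
candidates = {"list": [], "int": 1, "str": "a", "instance": Tracked()}
for name, obj in candidates.items():
    print(name, is_hashable(obj), gc.is_tracked(obj))
```

A list is tracked but not hashable, and an int is hashable but not tracked, so neither works as the test element; an instance of a trivial class satisfies both conditions.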
See the reply to Victor below. Just adding pyset_add will not work. We could add a _testcapi.test_pyset_add(tracked_item), but then the test functionality would be spread over both the Python and C sides.
Would it be possible to write tests in Python rather than in C?
I tried, but it is not easy. We have to expose Line 2778 in d78d7a5
And when calling
IIUC, if you return the first argument from pyset_add then you can test it on the Python side.
Ok, I gave it another try. The first attempt failed, but by using the vectorcall convention I can keep the reference count at 1 also from the Python side.
    Py_ssize_t pos = 0;
    setentry *entry;
    while (set_next((PySetObject *)op, &pos, &entry)) {
        if (_PyObject_GC_MAY_BE_TRACKED(entry->key)) {
Maybe we should use the faster _PyType_IS_GC(Py_TYPE(entry->key)), as in maybe_tracked from Objects/tupleobject.c?
Not sure performance matters a lot, but I would prefer to have it consistent with what is used in tupleobject.c. Unless there are objections, I will change the implementation to use maybe_tracked.
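The decision being discussed (a frozenset only needs GC tracking if at least one element could take part in a reference cycle) can be approximated in Python. This is an illustrative sketch only: the function name `may_need_tracking` and the class `Node` are hypothetical, and `gc.is_tracked` is a per-object check, whereas the C code under review inspects the element's type.

```python
import gc


def may_need_tracking(fs):
    """Return True if any element is GC-tracked and so could form a cycle."""
    return any(gc.is_tracked(item) for item in fs)


class Node:
    """Hypothetical user class; its instances are tracked by the GC."""


only_atoms = frozenset([1, "a", 2.5])
with_instance = frozenset([1, Node()])

print(may_need_tracking(only_atoms))
print(may_need_tracking(with_instance))
```

A frozenset for which this returns False holds only objects such as ints, strings, and floats, which cannot reference anything else, so the container itself cannot be part of a cycle.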
Co-authored-by: Mikhail Efimov <[email protected]>
…cpython into frozenset_immutable_tracking
        return make_new_set(type, iterable);
    }

    void
Please add a comment to explain the purpose of this function with a link to the GitHub issue.
Objects/setobject.c
Outdated
    void
    _PyFrozenSet_MaybeUntrack(PyObject *op)
    {
        if (op == NULL || !PyFrozenSet_CheckExact(op)) {
Why not untracking frozenset subtypes?
We can make a cycle using a frozenset subtype:
>>> class F(frozenset):
...     pass
...
>>> f = F([1,2,3])
>>> f.cycle = f
So, we need to track them.
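The cycle described in this comment can be observed with a weak reference. The sketch below is illustrative and not part of the PR; the helper `make_cycle` is a hypothetical name. It shows that the self-referencing subclass instance survives until the cyclic collector runs, which only works because the instance is tracked.

```python
import gc
import weakref


class F(frozenset):
    pass


def make_cycle():
    """Create an F that references itself and return a weak reference to it."""
    f = F([1, 2, 3])
    f.cycle = f  # the instance __dict__ now points back to the instance
    return weakref.ref(f)


gc.disable()  # keep automatic collections from interfering with the check
try:
    ref = make_cycle()
    # Reference counting alone cannot free the object: the cycle keeps it alive.
    alive_before = ref() is not None
    gc.collect()
    # The cyclic collector finds and frees it because the instance is tracked.
    alive_after = ref() is not None
finally:
    gc.enable()

print(alive_before, alive_after)
```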
Oh ok. In this case, please add a comment explaining that :-)
    # Test the PySet_Add c-api for frozenset objects
    assert _testcapi.pyset_add(frozenset(), 1) == frozenset([1])
    frozen_set = frozenset()
    self.assertRaises(SystemError, _testcapi.pyset_add, frozen_set, 1)
I don't understand why the second test fails, whereas the first succeeds.
The second test fails because the argument frozen_set is not uniquely referenced. The error is raised here:
Line 2777 in ce4b0ed
    (!PyFrozenSet_Check(anyset) || !_PyObject_IsUniquelyReferenced(anyset))) {
I will add a comment to the test.
Co-authored-by: Victor Stinner <[email protected]>
In the PR we untrack frozensets for the normal constructors. There are a few methods shared between set and frozenset (for example set_intersection in setobject.c) where we have not added the untracking. (This is possible, but I am not sure it is worthwhile.)
Here is a small script to test the idea:
It measures the performance of garbage collection, and outputs some statistics for the numbers of frozen containers.
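The script itself is not preserved in this conversation. As a rough sketch of the kind of measurement described, and explicitly not the author's script, one could create many frozensets of immutables, count how many the GC tracks, and time a full collection. The function name `measure` and its parameters are assumptions made for illustration.

```python
import gc
import time


def measure(n=100_000, repeats=5):
    """Create n small frozensets of ints and time full GC collections."""
    data = [frozenset((i, i + 1)) for i in range(n)]
    # How many of the frozensets does the GC currently track?
    tracked = sum(gc.is_tracked(fs) for fs in data)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        gc.collect()
        best = min(best, time.perf_counter() - start)
    return {
        "created": len(data),
        "tracked": tracked,
        "best_collect_seconds": best,
    }


stats = measure(n=10_000)
print(stats)
```

On an interpreter without the change, the tracked count would be expected to equal the number created; with the change it would be expected to drop, and the collection time with it. The exact numbers depend on the build and machine.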
Main:
PR